A Backcasting Approach for Anomaly Detection in Time Series Data

44th International Symposium on Forecasting, Dijon, France

Priyanga Dilini Talagala

July 1, 2024

Anomalies in Temporal Data

Dengue Outbreak

Weekly Dengue Cases in Gampaha District, Sri Lanka

Data Source: https://denguedatahub.netlify.app/

Weekly Dengue Cases in Sri Lanka

Daily COVID-19 Confirmed Cases

Outbreak

An occurrence of a disease in a specific geographic area that is significantly higher than the established baselines. This increase can be either sudden or gradual.

What is an Anomaly?

We define an anomaly as an observation that is very unlikely given the backcasted distribution.

An anomaly is an observation that exhibits a significant deviation from the established typical behavior.

Methodology

  • Backcasting is a planning method that starts with defining a desirable future and then works backwards to identify policies and programs that will connect that specified future to the present.

  • This approach allows us to strategically assess how current or future observations fit into historical trends and influences.

Off-line Phase

  • Build a model of a system’s typical behaviour

  • Use an Exponential Smoothing State Space (ETS) model with low smoothing parameters for the level and slope, and a high damping parameter for the slope, emphasizing the influence of recent observations in backcasting.
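As a rough illustration of the bullets above, the sketch below runs damped-trend exponential smoothing over a window in reverse time order, so that the one-step "forecast" of the reversed series acts as a one-step backcast. The function name `damped_holt_backcast` and the default values of `alpha`, `beta`, and `phi` are assumptions chosen for illustration, not the parameters estimated in the talk, and the simple initial state is likewise a simplification.

```python
def damped_holt_backcast(window, alpha=0.1, beta=0.05, phi=0.95):
    """One-step backcast: estimate the value just before window[0].

    alpha (level), beta (slope) and phi (damping) are illustrative
    defaults, NOT the author's estimated parameters.
    """
    if len(window) < 2:
        raise ValueError("need at least two observations")
    rev = list(reversed(window))        # newest observation first, oldest last
    level = rev[0]                      # simple initial state taken from the data
    slope = rev[1] - rev[0]
    for y in rev[1:]:
        pred = level + phi * slope      # one-step prediction in reversed time
        err = y - pred
        level = pred + alpha * err      # error-correction form of the level update
        slope = phi * slope + beta * err
    return level + phi * slope          # one step beyond the oldest observation


# Usage: a hypothetical, perfectly linear window.
estimate = damped_holt_backcast([1, 2, 3, 4], phi=1.0)
```

With small `alpha` and `beta`, single observations move the state only slightly, so the fitted level and slope change slowly across the window.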

Move the window one step ahead with each new data point

For each new data subset, reinitialize the model state with the new data without changing the estimated parameters.

Generate one-step backward projections using the reinitialized backcasting model.

Compare the backcasted values with the actual observed values.
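The moving-window steps above might be sketched as follows. This is a minimal sketch under stated assumptions: it assumes the window holds the observations immediately after the point being backcast, and it uses a simple exponential smoothing stand-in (`ses_backcast`, with a fixed hypothetical `alpha`) in place of the ETS model. The names `ses_backcast` and `backcast_errors` are hypothetical.

```python
def ses_backcast(window, alpha=0.2):
    """Stand-in backcaster: simple exponential smoothing run backwards.

    alpha stays fixed across windows; only the state (level) is
    re-initialized from each new window.
    """
    level = window[-1]                      # start from the newest observation
    for y in reversed(window[:-1]):         # walk back towards the oldest one
        level = alpha * y + (1 - alpha) * level
    return level                            # estimate for the point before window[0]


def backcast_errors(series, window_size, backcast=ses_backcast):
    """Slide the window one step at a time and collect backcast errors."""
    errors = []
    for i in range(len(series) - window_size):
        window = series[i + 1 : i + 1 + window_size]   # data following series[i]
        errors.append(series[i] - backcast(window))    # observed minus backcast
    return errors


# Usage: a flat hypothetical series with one spike at index 3.
demo_errors = backcast_errors([10, 10, 10, 50, 10, 10, 10, 10], window_size=3)
```

In this toy series the error at index 3 is large because the spike cannot be explained by the flat window that follows it, which is the kind of deviation the comparison step is meant to expose.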

Block Maxima Method for Anomalous Threshold Calculation

  • Select error data from the typical behaviour

  • Divide error data into blocks and extract block maxima and minima

  • Fit a Generalized Extreme Value (GEV) distribution to the block maxima and minima to model extreme error values

  • Determine the 95th percentile (upper threshold) and 5th percentile (lower threshold) of the GEV distribution
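The threshold steps above can be illustrated with a simplified sketch. Instead of a full three-parameter GEV fit, it fits a Gumbel distribution (the GEV special case with shape parameter zero) by the method of moments, which keeps the example dependency-free; a full fit could instead use a library routine such as `scipy.stats.genextreme.fit`. Block minima are handled by negation, so the lower threshold corresponds to the 5% tail of the minima. All function names and the block size are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def block_extremes(errors, block_size):
    """Split errors into consecutive full blocks; return (maxima, minima)."""
    blocks = [errors[i:i + block_size]
              for i in range(0, len(errors) - block_size + 1, block_size)]
    return [max(b) for b in blocks], [min(b) for b in blocks]


def gumbel_quantile(sample, p):
    """p-quantile of a Gumbel distribution fitted by the method of moments."""
    scale = statistics.stdev(sample) * math.sqrt(6) / math.pi
    loc = statistics.mean(sample) - EULER_GAMMA * scale
    return loc - scale * math.log(-math.log(p))


def anomaly_thresholds(errors, block_size, p=0.95):
    """Return (lower, upper) thresholds from block minima and maxima."""
    maxima, minima = block_extremes(errors, block_size)
    upper = gumbel_quantile(maxima, p)
    # min(x) == -max(-x): fit the negated minima, then flip the sign back.
    lower = -gumbel_quantile([-m for m in minima], p)
    return lower, upper


# Usage: hypothetical backcast errors from a period of typical behaviour.
typical_errors = [math.sin(i) for i in range(200)]
lower, upper = anomaly_thresholds(typical_errors, block_size=20)
```

A new backcast error falling above `upper` or below `lower` would then be flagged as anomalous under this sketch.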